[Kernel] Initial commit containing new Triton kernels for multi lora serving. #5025

Closed
FurtherAI wants to merge 3 commits

FurtherAI
Contributor

SGMV Triton Kernels

New Triton kernels for multi-LoRA computation. They should handle any shape and data type, operate on LoRAs stored in a paged format, compute at each LoRA's actual rank, and speed up grouped LoRA requests (especially prefill).

This PR contains the kernels, tests for the kernels, and benchmarks. A follow-up PR will add the ability to use them in vLLM.
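
For readers unfamiliar with the idea, here is a rough pure-Python reference of what a segmented LoRA matmul computes. It is only a sketch under assumptions: the function names, the dict-based adapter storage, and the `(start, end, lora_id)` segment layout are all hypothetical and are not taken from the PR's Triton code. It shows tokens grouped by adapter, with each group multiplied at its own adapter's rank.

```python
# Reference (non-Triton) sketch of a segmented LoRA matmul.
# All names here are hypothetical; this is not the PR's kernel code.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def sgmv_reference(x, lora_a, lora_b, segments):
    """Apply per-segment LoRA updates.

    x        : list of token rows, each of length hidden_in
    lora_a   : dict lora_id -> matrix of shape (hidden_in, rank)
    lora_b   : dict lora_id -> matrix of shape (rank, hidden_out)
    segments : list of (start, end, lora_id); tokens in [start, end) share a LoRA
    """
    hidden_out = len(next(iter(lora_b.values()))[0])
    y = [[0.0] * hidden_out for _ in x]
    for start, end, lora_id in segments:
        # Shrink to this adapter's own rank, then expand back out.
        shrunk = matmul(x[start:end], lora_a[lora_id])
        y[start:end] = matmul(shrunk, lora_b[lora_id])
    return y


# Three tokens: the first two use a rank-1 adapter, the last a rank-2 adapter.
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
lora_a = {0: [[1.0], [1.0]], 1: [[1.0, 0.0], [0.0, 1.0]]}
lora_b = {0: [[1.0, 2.0]], 1: [[0.0, 1.0], [1.0, 0.0]]}
segments = [(0, 2, 0), (2, 3, 1)]
result = sgmv_reference(x, lora_a, lora_b, segments)
print(result)
```

Because each segment is multiplied against its own adapter, adapters of different ranks can sit in the same batch without padding to a common rank, which is the property the description above refers to.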

ping @Yard1

@FurtherAI
Contributor Author

@Yard1 Any idea why the import fails? Works locally, but the kernel is in a new folder so maybe that path has to be added somewhere?

@Yard1
Collaborator

Yard1 commented May 24, 2024

@FurtherAI add __init__.py to the new folder
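
The thread does not say why the import failed only in CI. One plausible explanation, and it is an assumption, is that Python's namespace-package support lets a bare folder import from a local checkout, while package-discovery tooling skips folders that lack `__init__.py`. The sketch below uses the standard library's `pkgutil.iter_modules` to show that kind of discovery behavior; the folder names and helper functions are hypothetical.

```python
import os
import pkgutil
import tempfile


def discoverable_packages(root):
    """Return names of subdirectories that pkgutil reports as packages."""
    return sorted(name for _, name, ispkg in pkgutil.iter_modules([root]) if ispkg)


def make_dir_with_module(root, name, with_init):
    """Create root/name/kernel.py, optionally alongside an __init__.py."""
    path = os.path.join(root, name)
    os.makedirs(path)
    open(os.path.join(path, "kernel.py"), "w").close()
    if with_init:
        open(os.path.join(path, "__init__.py"), "w").close()


with tempfile.TemporaryDirectory() as root:
    make_dir_with_module(root, "with_init", with_init=True)
    make_dir_with_module(root, "without_init", with_init=False)
    # Only the folder that contains __init__.py is reported as a package.
    found = discoverable_packages(root)

print(found)
```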

@FurtherAI
Contributor Author

@Yard1 Is there a way to rerun the tests without an empty commit?

@Yard1
Collaborator

Yard1 commented May 29, 2024

@FurtherAI Making an empty commit with git commit --allow-empty -m "Trigger CI" is the way to go

@tensimixt

@FurtherAI Does this allow for larger vocabulary sizes? For example, NeMO-12B has a vocab size of 131072.
If I run the version of vLLM from this pull request with LoRA enabled, will it still say

When using LoRA, vocab size must be "32000 >= vocab_size <= 128512"

@FurtherAI
Contributor Author

@tensimixt Yeah I think it does. It shouldn't have any issues with different sizes. I'll test it at some point


This pull request has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this pull request should remain open. Thank you!

github-actions bot added the stale label (Over 90 days of inactivity) on Oct 26, 2024

This pull request has been automatically closed due to inactivity. Please feel free to reopen if you intend to continue working on it. Thank you!

github-actions bot closed this on Nov 25, 2024